Readability-based Sentence Ranking for Evaluating Text Simplification

Authors

  • Sowmya Vajjala
  • Walt Detmar Meurers
Abstract

We propose a new method for evaluating the readability of simplified sentences through pairwise ranking. The validity of the method is established through in-corpus and cross-corpus evaluation experiments. The approach correctly identifies the ranking of simplified and unsimplified sentences in terms of their reading level with an accuracy of over 80%, significantly outperforming previous results. To gain qualitative insights into the nature of simplification at the sentence level, we studied the impact of specific linguistic features. We empirically confirm that both word-level and syntactic features play a role in comparing the degree of simplification of authentic data. To carry out this research, we created a new sentence-aligned corpus from professionally simplified news articles. The new corpus resource enriches the empirical basis of sentence-level simplification research, which has so far relied on a single resource. Most importantly, it facilitates cross-corpus evaluation for simplification, a key step towards generalizable results.
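
As a rough illustration of what pairwise ranking of sentence readability can look like, the sketch below learns a linear scoring function from (harder, simpler) sentence pairs and reports pairwise accuracy. Everything in it is an assumption made for illustration: the two surface features, the perceptron-style update, the function names, and the toy sentence pairs are hypothetical and are not the features, learner, or corpus used in the paper.

```python
# Hypothetical sketch: pairwise ranking on feature differences.
# Features, learner, and data are illustrative assumptions, not the paper's setup.

def features(sentence):
    """Two toy surface features: word count and mean token length."""
    words = sentence.split()
    n = len(words)
    mean_len = sum(len(w) for w in words) / n if n else 0.0
    return [float(n), mean_len]

def diff(a, b):
    """Feature difference f(a) - f(b), the input to the pairwise ranker."""
    return [x - y for x, y in zip(features(a), features(b))]

def score(w, d):
    return sum(wi * di for wi, di in zip(w, d))

def train(pairs, epochs=20, lr=0.1):
    """pairs: list of (harder, simpler). Learn w with w . (f(harder) - f(simpler)) > 0."""
    w = [0.0, 0.0]
    for _ in range(epochs):
        for hard, simple in pairs:
            d = diff(hard, simple)
            if score(w, d) <= 0:  # pair is misranked (or tied): perceptron update
                w = [wi + lr * di for wi, di in zip(w, d)]
    return w

def harder_first(w, a, b):
    """True if sentence a is ranked as harder to read than sentence b."""
    return score(w, diff(a, b)) > 0

def pairwise_accuracy(w, pairs):
    """Fraction of (harder, simpler) pairs that the model orders correctly."""
    return sum(harder_first(w, h, s) for h, s in pairs) / len(pairs)

# Invented toy pairs (harder, simpler); not taken from any corpus.
TRAIN_PAIRS = [
    ("The committee postponed the deliberations indefinitely.",
     "The group put off the talks."),
    ("He endeavoured to ameliorate the circumstances considerably.",
     "He tried to make things better."),
]

if __name__ == "__main__":
    w = train(TRAIN_PAIRS)
    print("weights:", w)
    print("pairwise accuracy:", pairwise_accuracy(w, TRAIN_PAIRS))
```

Training on differences of feature vectors is one common way to reduce ranking to binary classification. In a cross-corpus setting, `train` would be fit on pairs from one aligned corpus and `pairwise_accuracy` computed on pairs from another.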

Similar resources

Assessing the Readability of Sentences: Which Corpora and Features?

The paper investigates the problem of sentence readability assessment, which is modelled as a classification task, with a specific view to text simplification. In particular, it addresses two open issues connected with it, i.e. the corpora to be used for training, and the identification of the most effective features to determine sentence readability. An existing readability assessment tool dev...

Readability Assessment for Text Simplification: From Analyzing Documents to Identifying Sentential Simplifications

Readability assessment can play a role in the evaluation of a simplification algorithm as well as in the identification of what to simplify. While some previous research used traditional readability formulas to evaluate text simplification, there is little research into the utility of readability assessment for identifying and analyzing sentence level targets for text simplification. We explore...

Sentence Simplification by Monolingual Machine Translation

In this paper we describe a method for simplifying sentences using Phrase Based Machine Translation, augmented with a re-ranking heuristic based on dissimilarity, and trained on a monolingual parallel corpus. We compare our system to a word-substitution baseline and two state-of-the-art systems, all trained and tested on paired sentences from the English part of Wikipedia and Simple Wikipedia. ...

Assessing the relative reading level of sentence pairs for text simplification

While the automatic analysis of the readability of texts has a long history, the use of readability assessment for text simplification has received only little attention so far. In this paper, we explore readability models for identifying differences in the reading levels of simplified and unsimplified versions of sentences. Our experiments show that a relative ranking is preferable to an absol...

Simple, readable sub-sentences

We present experiments using a new unsupervised approach to automatic text simplification, which builds on sampling and ranking via a loss function informed by readability research. The main idea is that a loss function can distinguish good simplification candidates among randomly sampled sub-sentences of the input sentence. Our approach is rated as equally grammatical and beginner reader appro...

Journal:
  • CoRR

Volume: abs/1603.06009

Publication date: 2016